IN THE CLAIMS:

Please substitute the following claims for the same-numbered claims in the application:
1. (Currently Amended) A method for text summarization produced by clustering data
points with defined quantified relation values between them, said method comprising:

obtaining a lead value for each data point, wherein said lead value for each data point is
[[derived]] calculated by taking a sum of all relation values input into said data point [[plus]]
weighted by a frequency of occurrence associated with said data point,

ranking each data point in a lead value sequence list in descending order of lead value,

assigning a first data point in said lead value sequence list as a leader of a first cluster,

considering each subsequent data point in said lead value sequence list as a leader of a
new cluster if its relationship with leaders of each of the previous clusters is less than a defined
threshold value or as a member of at least one cluster where its relationship with a cluster leader
is at least equal to said threshold value, wherein the threshold value is adaptively found for a
given number of clusters, and

generating [[a]] said text summarization of any of a single document and a collection of
documents by segmenting a given text input comprising said data points into clusters, and
forming a set of leaders of said clusters to represent said text summarization.
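
For illustration only, the steps recited in claim 1 can be sketched in Python as follows. This is a minimal sketch and not the applicant's implementation. It assumes that `rel[i][j]` holds the relation value from point `i` into point `j`, that "weighted by" means multiplication by the frequency, and that a point's relationship with a leader is read from `rel[point][leader]`. The function names and the sample matrix are hypothetical.

```python
from typing import Dict, List, Sequence

Matrix = Sequence[Sequence[float]]


def lead_values(rel: Matrix, freq: Sequence[float]) -> List[float]:
    """Lead value of point j: sum of relation values input into j, times freq[j].

    Assumption: rel[i][j] is the relation value from point i into point j,
    and the diagonal (self-relation) is ignored.
    """
    n = len(rel)
    return [freq[j] * sum(rel[i][j] for i in range(n) if i != j) for j in range(n)]


def leader_clusters(rel: Matrix, freq: Sequence[float], threshold: float) -> Dict[int, List[int]]:
    """Map each cluster leader to its members, visiting points by descending lead value."""
    lead = lead_values(rel, freq)
    order = sorted(range(len(rel)), key=lambda j: lead[j], reverse=True)
    clusters: Dict[int, List[int]] = {}
    for p in order:
        # Leaders whose relationship with p reaches the threshold.
        close = [leader for leader in clusters if rel[p][leader] >= threshold]
        if close:
            for leader in close:  # p may be a member of more than one cluster
                clusters[leader].append(p)
        else:
            clusters[p] = [p]  # p becomes the leader of a new cluster
    return clusters


# Hypothetical relation values between four data points (e.g. four sentences).
rel = [
    [1.0, 0.8, 0.1, 0.0],
    [0.9, 1.0, 0.2, 0.1],
    [0.1, 0.1, 1.0, 0.7],
    [0.0, 0.2, 0.6, 1.0],
]
clusters = leader_clusters(rel, freq=[1, 1, 1, 1], threshold=0.5)
summary = list(clusters)  # the set of leaders represents the summary
print(clusters, summary)
```

With these sample values, points 1 and 2 become leaders and points 0 and 3 join them as members, so the leader set `[1, 2]` stands in for the summary.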

2. (Previously Presented) The method of claim 1, wherein said quantified relationships
between data points are any of symmetric and asymmetric quantified relationships. 

3. (Currently Amended) The method of claim 1, wherein said frequency of occurrence
equals one.

4. (Previously Presented) The method of claim 1, further comprising identifying distinct
data points using said lead values and said relation values between said data points.

5. (Previously Presented) The method of claim 1, further comprising organizing a set of data
points into a hierarchy of clusters by clustering the data points into sets of small sizes, wherein
each smaller set is further subclustered; and repeatedly subclustering said smaller set until a
terminating condition is reached.
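
As a hedged illustration of the repeated subclustering in claim 5, the sketch below recurses on each cluster until a terminating condition holds. The claim does not specify that condition, so the size limit, the depth limit, and the toy prefix-based clusterer `by_prefix` are all assumptions made for this example.

```python
from typing import Callable, Dict, List

FlatClusterer = Callable[[List[str], int], List[List[str]]]


def build_hierarchy(points: List[str], flat_cluster: FlatClusterer,
                    max_size: int = 2, depth: int = 0, max_depth: int = 10) -> Dict:
    """Recursively subcluster `points` into a tree of {"members", "children"} nodes."""
    node = {"members": list(points), "children": []}
    # Assumed terminating condition: the set is small enough or the tree is deep enough.
    if len(points) <= max_size or depth >= max_depth:
        return node
    groups = flat_cluster(points, depth)
    if len(groups) <= 1:  # the clusterer cannot split this set any further
        return node
    node["children"] = [
        build_hierarchy(group, flat_cluster, max_size, depth + 1, max_depth)
        for group in groups
    ]
    return node


def by_prefix(words: List[str], depth: int) -> List[List[str]]:
    """Toy flat clusterer: group words that share their first depth+1 letters."""
    groups: Dict[str, List[str]] = {}
    for word in words:
        groups.setdefault(word[: depth + 1], []).append(word)
    return list(groups.values())


tree = build_hierarchy(["apple", "apricot", "avocado", "banana", "blueberry"], by_prefix)
top_level = [child["members"] for child in tree["children"]]
print(top_level)
```

Any flat clustering routine with the same call shape could be passed in place of `by_prefix`.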

6. (Previously Presented) The method of claim 1, wherein said step of generating further
comprises:

segmenting a given input text into blocks comprising sentences, a collection of sentences,
and paragraphs,

excluding words belonging to a defined list of defined stop words,

replacing words by their existing unique synonymous word from a given a collection of
synonyms,

applying stemming algorithms for mapping words to root words,

representing resulting blocks of text, with respect to a dictionary which is either given or
computed from the input text, by a binary vector of size equal to the number of words in the
dictionary whose ith element is 1 if an ith word in the dictionary is present in the block,

computing the relationship between any data points di and dj by evaluating R(di,dj) =
|di.Tdj|/|di|, wherein T is a thesaurus matrix whose ijth element reflects an extent of inclusion of
meaning of jth word in the meaning of ith word, and

09/815,616
clustering the data points.
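
The preprocessing and relation steps of claim 6 can be illustrated with the following Python sketch. It is a simplified reading under stated assumptions: blocks are sentences split on periods, the stop list and synonym map are tiny hypothetical examples, `crude_stem` is a toy stand-in for a real stemming algorithm, T is the identity matrix, and |di| is taken to be the number of ones in the binary vector di.

```python
import re
from typing import Dict, List

STOP_WORDS = {"the", "a", "is", "are"}       # hypothetical stop list
SYNONYMS: Dict[str, str] = {"automobile": "car"}  # hypothetical synonym map


def split_blocks(text: str) -> List[str]:
    """Segment text into sentence blocks (assumption: sentences end with '.')."""
    return [b.strip() for b in text.split(".") if b.strip()]


def crude_stem(word: str) -> str:
    """Toy stemmer that strips one trailing 's'; not a real stemming algorithm."""
    return word[:-1] if word.endswith("s") and len(word) > 3 else word


def tokenize(block: str) -> List[str]:
    """Apply the claim's order: stop-word removal, synonym replacement, stemming."""
    words = re.findall(r"[a-z]+", block.lower())
    words = [w for w in words if w not in STOP_WORDS]
    words = [SYNONYMS.get(w, w) for w in words]
    return [crude_stem(w) for w in words]


def binary_vector(tokens: List[str], dictionary: List[str]) -> List[int]:
    """ith element is 1 if the ith dictionary word is present in the block."""
    present = set(tokens)
    return [1 if w in present else 0 for w in dictionary]


def relation(di: List[int], dj: List[int], T: List[List[float]]) -> float:
    """R(di, dj) = |di . T dj| / |di|, with |di| assumed to be the count of ones in di."""
    t_dj = [sum(T[r][c] * dj[c] for c in range(len(dj))) for r in range(len(T))]
    size = sum(di)
    return abs(sum(a * b for a, b in zip(di, t_dj))) / size if size else 0.0


blocks = split_blocks("The cars are red. The automobile is fast.")
tokens = [tokenize(b) for b in blocks]
dictionary = sorted({w for t in tokens for w in t})  # dictionary computed from the input
vectors = [binary_vector(t, dictionary) for t in tokens]
identity = [[1.0 if r == c else 0.0 for c in range(len(dictionary))]
            for r in range(len(dictionary))]
r01 = relation(vectors[0], vectors[1], identity)
print(dictionary, vectors, r01)
```

With the identity thesaurus, R(di,dj) reduces to the fraction of words in block di that also occur in block dj, which is 0.5 for the two sample sentences.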

7. (Previously Presented) The method of claim 6, wherein said dictionary is computed by
taking a fraction of words, excluding said stop words, with a highest tfidf value, which is given
by:

tfidf(wi) = tfi * log(N/dfi),

where tfidf(wi) is the lead value of data point wi, tfi = a number of times the data point wi
occurred in a whole text, dfi = a number of documents containing the data point wi and N = a
total number of documents in the text.
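
A short Python sketch of the tfidf computation in claim 7 follows. It assumes the documents are already tokenized with stop words removed, uses the natural logarithm (the claim does not state a base), and breaks ties alphabetically. The function names and sample documents are hypothetical.

```python
import math
from collections import Counter
from typing import Dict, List


def tfidf_scores(docs: List[List[str]]) -> Dict[str, float]:
    """tfidf(wi) = tfi * log(N / dfi) over a list of tokenized documents."""
    n_docs = len(docs)
    tf = Counter(w for doc in docs for w in doc)       # occurrences in the whole text
    df = Counter(w for doc in docs for w in set(doc))  # documents containing the word
    return {w: tf[w] * math.log(n_docs / df[w]) for w in tf}


def top_fraction(scores: Dict[str, float], fraction: float) -> List[str]:
    """Keep the given fraction of words with the highest tfidf value."""
    keep = max(1, int(len(scores) * fraction))
    ranked = sorted(scores, key=lambda w: (-scores[w], w))
    return ranked[:keep]


docs = [["car", "red", "car"], ["car", "fast"], ["cat", "sleep", "cat", "cat"]]
scores = tfidf_scores(docs)
dictionary = top_fraction(scores, 0.5)
print(dictionary)
```

Here "cat" scores 3 * log(3) and "car" scores 3 * log(1.5), so they form the dictionary when half of the five distinct words are kept.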

8. (Previously Presented) The method of claim 6, wherein said thesaurus matrix comprises
any of a given identity matrix, and a computed matrix from a collection of documents. 

9. (Currently Amended) The method of claim 6, wherein each block is represented by a
vector whose ith element represents [[a]] said frequency of occurrence of said ith word in the
block.
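
For comparison with the binary representation of claim 6, the frequency-based vector of claim 9 might be built as in this small sketch. The function name and sample data are hypothetical.

```python
from collections import Counter
from typing import List


def frequency_vector(tokens: List[str], dictionary: List[str]) -> List[int]:
    """ith element is the number of times the ith dictionary word occurs in the block."""
    counts = Counter(tokens)
    return [counts[word] for word in dictionary]


vector = frequency_vector(["car", "red", "car"], ["car", "fast", "red"])
print(vector)
```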

10. (Previously Presented) The method of claim 6, further comprising organizing a set of text
documents into a hierarchy of clusters by clustering given documents into sets of small sizes,
wherein each smaller set is further subclustered; and repeatedly subclustering said smaller set
until a terminating condition is reached.

11. (Previously Presented) The method of claim 10, further comprising organizing results
returned by an information retrieval system in response to an user query into an hierarchy of
clusters.

12. (Previously Presented) The method of claim 11, wherein the hierarchy is used to aid the
user in any of modifying a query of said user and browsing through said results.

13. (Previously Presented) The method of claim 11, wherein said information retrieval
system comprises a search engine retrieving Web documents. 

14. (Previously Presented) The method of claim 5, wherein said step of generating is applied 
to vocabulary organization for a group of documents wherein the data points are words in a
dictionary of the vocabulary, wherein the lead value of a word is any of its frequency of
occurrence in the collection of documents, a number of documents containing the word, and a
tfidf value of said word, wherein a relationship R(di,dj) denotes a fraction of documents
containing a jth word that also contains an ith word, and the clustering of said data points results
in a structured hierarchical organization of the vocabulary.
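
The word-to-word relationship recited in claim 14 can be illustrated as below. This is a sketch under the assumption that each document is represented as a set of words; the function name and sample documents are hypothetical.

```python
from typing import List, Set


def cooccurrence_relation(docs: List[Set[str]], wi: str, wj: str) -> float:
    """R(wi, wj): fraction of documents containing wj that also contain wi."""
    with_j = [doc for doc in docs if wj in doc]
    if not with_j:
        return 0.0  # wj never occurs, so the fraction is taken as zero
    return sum(1 for doc in with_j if wi in doc) / len(with_j)


docs = [{"car", "engine"}, {"car", "wheel"}, {"car", "engine", "oil"}, {"boat"}]
r_engine_car = cooccurrence_relation(docs, "engine", "car")
r_car_engine = cooccurrence_relation(docs, "car", "engine")
print(r_engine_car, r_car_engine)
```

The two directions differ here (2/3 versus 1.0), which shows how such a relationship can be asymmetric. The same fraction-based form appears in claims 18 and 22 with transactions and purchased products in place of documents and words.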

15. (Previously Presented) The method of claim 14, wherein a structured vocabulary is used
to provide text summarization for associated documents.

16. (Previously Presented) The method of claim 14, further comprising applying the
clustering to customer profiling wherein a dictionary is built and the vocabulary is organized 
using documents that are viewed by a customer. 

17. (Previously Presented) The method of claim 5, wherein said data points correspond to
products cataloged in an electronic store, the lead value of a product is its per unit profit, its per
unit value or a number of items sold per unit time, and a relationship between the products is
either explicitly defined or derived from purchase data.

18. (Previously Presented) The method of claim 17, wherein a product di is related to a
product dj by a fraction of customer transactions containing dj that also contain di.

19. (Previously Presented) The method of claim 17, further comprising applying the
clustering to any of an analysis of sales of a store for a merchant, and an organization of a
layout of the store to facilitate easy access to products.

20. (Previously Presented) The method of claim 17, further comprising applying the
clustering to personalize an electronic store layout to an individual customer by using a
relationship that is specific to the individual customer.



21. (Previously Presented) The method of claim 5, further comprising applying the clustering
to customer segmentation for a sales or service organization wherein the data points comprise
customers in a database, wherein the lead values are any of a total purchase amount per unit time
of said customers, income of said customers, a number of times customers visited an electronic
store, and a number of items bought by the customer, wherein a relationship between customers
is either explicitly defined or derived from some relevant data, with a resulting clustering
reflecting a structured grouping of customers with similar performances.

22. (Previously Presented) The method of claim 21, wherein a customer di is related to a
customer dj by a fraction of products bought by dj that are also bought by di.

23. (Currently Amended) A system for text summarization produced by clustering data
points with defined quantified relation values between them, said system comprising:

means for obtaining a lead value for each data point, wherein said lead value for each 
data point is [[derived]] calculated by taking a sum of all relation values input into said data point
[[plus]] weighted by a frequency of occurrence associated with said data point,

means for ranking each data point in a lead value sequence list in descending order of
lead value, 

means for assigning a first data point in said lead value sequence list as a leader of a first
cluster,

means for considering each subsequent data point in said lead value sequence list as a
leader of a new cluster if its relationship with leaders of each of the previous clusters is less than
a defined threshold value or as a member of at least one cluster where its relationship with a
cluster leader is at least equal to said threshold value, wherein the threshold value is adaptively 
found for a given number of clusters, and 

means for generating [[a]] said text summarization of any of a single document and a
collection of documents by segmenting a given text input comprising said data points into
clusters, and forming a set of leaders of said clusters to represent said text summarization. 
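
The claims state that the threshold value is adaptively found for a given number of clusters but do not say how. One possible approach, offered only as an assumption, is to try each distinct relation value as a candidate threshold and keep the one whose leader count is closest to the requested number. In the sketch below, `order` is the lead value sequence list (point indices in descending order of lead value), and the sample matrix is hypothetical.

```python
from typing import List, Sequence

Matrix = Sequence[Sequence[float]]


def count_leaders(rel: Matrix, order: Sequence[int], threshold: float) -> int:
    """Number of cluster leaders produced by one pass over the ranked points."""
    leaders: List[int] = []
    for p in order:
        # p becomes a leader only if it is below the threshold for every earlier leader.
        if all(rel[p][leader] < threshold for leader in leaders):
            leaders.append(p)
    return len(leaders)


def adaptive_threshold(rel: Matrix, order: Sequence[int], target: int) -> float:
    """Pick the candidate threshold whose leader count is nearest to `target`.

    Assumption: candidates are the distinct values in `rel`; ties go to the
    smallest candidate.
    """
    candidates = sorted({value for row in rel for value in row})
    return min(candidates,
               key=lambda t: (abs(count_leaders(rel, order, t) - target), t))


rel = [
    [1.0, 0.8, 0.1, 0.0],
    [0.9, 1.0, 0.2, 0.1],
    [0.1, 0.1, 1.0, 0.7],
    [0.0, 0.2, 0.6, 1.0],
]
order = [1, 0, 2, 3]  # points ranked by descending lead value
threshold_for_two = adaptive_threshold(rel, order, target=2)
print(threshold_for_two, count_leaders(rel, order, threshold_for_two))
```

An exhaustive scan is used instead of a bisection because the leader count is not guaranteed to change monotonically with the threshold.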

24. (Previously Presented) The system of claim 23, wherein said quantified relationships
between data points are any of symmetric and asymmetric quantified relationships.

25. (Currently Amended) The system of claim 23, wherein said frequency of occurrence
equals one.

26. (Previously Presented) The system of claim 23, further comprising means for identifying
distinct data points using said lead values and said relation values between said data points. 

27. (Previously Presented) The system of claim 23, further comprising means for organizing
a set of data points into a hierarchy of clusters using means for clustering the data points into sets
of small sizes, wherein each smaller set is further subclustered; and repeatedly subclustering said
smaller set until a terminating condition is reached.

28. (Previously Presented) The system of claim 23, wherein said means for generating further
comprises:

means for segmenting a given input text into blocks comprising sentences, a collection of
sentences, and paragraphs,

means for excluding words belonging to a defined list of defined stop words,

means for replacing words by their existing unique synonymous word from a given a
collection of synonyms,

means for applying stemming algorithms for mapping words to root words,

means for representing resulting blocks of text, with respect to a dictionary which is
either given or computed from the input text, by a binary vector of size equal to the number of
words in the dictionary whose ith element is 1 if an ith word in the dictionary is present in the
block,

means for computing the relationship between any data points di and dj by evaluating
R(di,dj) = |di.Tdj|/|di|, wherein T is a thesaurus matrix whose ijth element reflects an extent of
inclusion of meaning of jth word in the meaning of ith word, and

means for clustering the data points.

29. (Previously Presented) The system of claim 28, wherein said dictionary is computed by
taking a fraction of words, excluding said stop words, with a highest tfidf value, which is given
by:

tfidf(wi) = tfi * log(N/dfi),

where tfidf(wi) is the lead value of data point wi, tfi = a number of times the data point wi
occurred in a whole text, dfi = a number of documents containing the data point wi and N = a
total number of documents in the text.

30. (Previously Presented) The system of claim 28, wherein said thesaurus matrix comprises
any of a given identity matrix, and a computed matrix from a collection of documents.

31. (Currently Amended) The system of claim 28, wherein each block is represented by a
vector whose ith element represents [[a]] said frequency of occurrence of said ith word in the
block.

32. (Previously Presented) The system of claim 28, further comprising means for organizing
a set of text documents into a hierarchy of clusters by using means for clustering given
documents into sets of small sizes, wherein each smaller set is further subclustered; and
repeatedly subclustering said smaller set until a terminating condition is reached.

33. (Previously Presented) The system of claim 32, further comprising means for organizing
results returned by an information retrieval system in response to an user query into an hierarchy
of clusters.

34. (Previously Presented) The system of claim 33, wherein the hierarchy is used to aid the
user in any of modifying a query of said user and browsing through said results.

35. (Previously Presented) The system of claim 33, wherein said information retrieval system
comprises a search engine retrieving Web documents.

36. (Previously Presented) The system of claim 27, wherein said means for generating is used
for vocabulary organization for a group of documents wherein the data points are words in a
dictionary of the vocabulary, wherein the lead value of a word is any of its frequency of
occurrence in the collection of documents, a number of documents containing the word, and a
tfidf value of said word, wherein a relationship R(di,dj) denotes a fraction of documents
containing a jth word that also contains an ith word, and the clustering of said data points results
in a structured hierarchical organization of the vocabulary.

37. (Previously Presented) The system of claim 36, wherein a structured vocabulary is used
to provide text summarization for associated documents.

38. (Previously Presented) The system of claim 36, further comprising means for using the
clustering for customer profiling wherein a dictionary is built and the vocabulary is organized
using documents that are viewed by a customer.



39. (Previously Presented) The system of claim 27, wherein said data points correspond to 
products cataloged in an electronic store, the lead value of a product is its per unit profit, its per
unit value or a number of items sold per unit time, and a relationship between the products is
either explicitly defined or derived from purchase data.



40. (Previously Presented) The system of claim 39, wherein a product di is related to a
product dj by a fraction of customer transactions containing dj that also contain di.

41. (Previously Presented) The system of claim 39, further comprising means for applying
the clustering to any of an analysis of sales of a store for a merchant, and an organization of a 
layout of the store to facilitate easy access to products. 

42. (Previously Presented) The system of claim 39, further comprising means for applying 
the clustering to personalize an electronic store layout to an individual customer by using a 
relationship that is specific to the individual customer.

43. (Previously Presented) The system of claim 27, further comprising means for applying
the clustering for customer segmentation for a sales or service organization wherein the data
points comprise customers in a database, wherein the lead values are any of a total purchase
amount per unit time of said customers, income of said customers, a number of times customers
visited an electronic store, and a number of items bought by the customer, wherein a relationship
between customers is either explicitly defined or derived from some relevant data, with a
resulting clustering reflecting a structured grouping of customers with similar performances.

44. (Previously Presented) The system of claim wherein a customer di is related to a
customer dj by a fraction of products bought by dj that are also bought by di.

45. (Currently Amended) A computer program product comprising computer readable
program code stored on computer readable storage medium embodied therein for text
summarization produced by clustering data points with defined quantified relation values
between them, said computer program product comprising:

computer readable program code means for obtaining a lead value for each data point,
wherein said lead value for each data point is [[derived]] calculated by taking a sum of all relation
values input into said data point [[plus]] weighted by a frequency of occurrence associated with said
data point,

computer readable program code means for ranking each data point in a lead value 
sequence list in descending order of lead value, 

computer readable program code means for assigning a first data point in said lead value
sequence list as a leader of a first cluster, 

computer readable program code means for considering each subsequent data point in
said lead value sequence list as a leader of a new cluster if its relationship with leaders of each of
the previous clusters is less than a defined threshold value or a member of at least one cluster
where its relationship with a cluster leader is at least equal to said threshold value, wherein the
threshold value is adaptively found for a given number of clusters, and

computer readable program code means for generating [[a]] text summarization of
any of a single document and a collection of documents by segmenting a given text input
comprising said data points into clusters, and forming a set of leaders of said clusters to represent
said text summarization.

46. (Previously Presented) The computer program product of claim 45, wherein said
quantified relationships between data points are any of symmetric and asymmetric quantified
relationships.

47. (Currently Amended) The computer program product of claim 45, wherein said
frequency of occurrence equals one.

48. (Previously Presented) The computer program product of claim 45, further comprising 
computer readable program code means for identifying distinct data points using said lead values
and said relation values between said data points. 

49. (Previously Presented) The computer program product of claim 45, further comprising
computer readable program code means configured for organizing a set of data points into a
hierarchy of clusters using computer readable program code means configured for clustering the
data points into sets of small sizes, wherein each smaller set is further subclustered; and
repeatedly subclustering said smaller set until a terminating condition is reached.

50. (Previously Presented) The computer program product of claim 45, wherein said 
computer readable program code means configured for generating further comprises:

computer readable program code means configured for segmenting a given input text into
blocks comprising sentences, a collection of sentences, and paragraphs,

computer readable program code means configured for excluding words belonging to a
defined list of defined stop words,

computer readable program code means configured for replacing words by their existing
unique synonymous word, if it exists, from a given a collection of synonyms,

computer readable program code means configured for applying stemming algorithms for
mapping words to root words,

computer readable program code means configured for representing resulting blocks of
text, with respect to a dictionary which is either given or computed from the input text, by a
binary vector of size equal to the number of words in the dictionary whose ith element is 1 if an
ith word in the dictionary is present in the block,

computer readable program code means configured for computing the relationship
between any data points di and dj by evaluating R(di,dj) = |di.Tdj|/|di|, wherein T is a thesaurus
matrix whose ijth element reflects an extent of inclusion of meaning of jth word in the meaning
of ith word, and

computer readable program code means configured for clustering the data points.

51. (Previously Presented) The computer program product of claim 50, wherein said
dictionary is computed by taking a fraction of words, excluding said stop words, with a highest
tfidf value, which is given by:

tfidf(wi) = tfi * log(N/dfi),

where tfidf(wi) is the lead value of data point wi, tfi = a number of times the data point wi
occurred in a whole text, dfi = a number of documents containing the data point wi and N = a
total number of documents in the text.

52. (Previously Presented) The computer program product of claim 50, wherein said 
thesaurus matrix comprises any of a given identity matrix, and a computed matrix from a
collection of documents. 

53. (Currently Amended) The computer program product of claim 50, wherein each block is 
represented by a vector whose ith element represents [[a]] said frequency of occurrence of said
ith word in the block.

54. (Previously Presented) The computer program product of claim 50, further comprising
computer readable program code means configured for organizing a set of text documents into a
hierarchy of clusters by using computer readable program code means configured for clustering
given documents into sets of small sizes, wherein each smaller set is further subclustered; and
repeatedly subclustering said smaller set until a terminating condition is reached.

55. (Previously Presented) The computer program product of claim 54, further comprising
computer readable program code means configured for organizing results returned by an
information retrieval system in response to an user query into an hierarchy of clusters.

56. (Previously Presented) The computer program product of claim 55, wherein the hierarchy
is used to aid the user in any of modifying a query of said user and browsing through said results.

57. (Previously Presented) The computer program product of claim 55, wherein said
information retrieval system comprises a search engine retrieving Web documents.

58. (Previously Presented) The computer program product of claim 49, wherein said
computer readable program code means configured for generating is used for vocabulary
organization for a group of documents wherein the data points are words in a dictionary of the
vocabulary, wherein the lead value of a word is any of its frequency of occurrence in the
collection of documents, a number of documents containing the word, and a tfidf value of said
word, wherein a relationship R(di,dj) denotes a fraction of documents containing a jth word that
also contains an ith word, and the clustering of said data points results in a structured hierarchical
organization of the vocabulary.

59. (Previously Presented) The computer program of claim 58, wherein a structured
vocabulary is used to provide text summarization for associated documents.

60. (Previously Presented) The computer program product of claim 58, further comprising
computer readable program code means configured for using the clustering for customer
profiling wherein a dictionary is built and the vocabulary is organized using documents that are
viewed by a customer.

61. (Previously Presented) The computer program product of claim 49, wherein said data
points correspond to products cataloged in an electronic store, the lead value of a product is its
per unit profit, its per unit value or a number of items sold per unit time, and a relationship
between the products is either explicitly defined or derived from purchase data.

62. (Previously Presented) The computer program product of claim 61, wherein a product di
is related to a product dj by a fraction of customer transactions containing dj that also contain di.

63. (Previously Presented) The computer program product of claim 61, further comprising
computer readable program code means configured for applying the clustering to any of an
analysis of sales of a store for a merchant, and an organization of a layout of the store to facilitate
easy access to products.

64. (Previously Presented) The computer program product of claim 61, further comprising
computer readable program code means configured for applying the clustering to personalize an
electronic store layout to an individual customer by using a relationship that is specific to the
individual customer.

65. (Previously Presented) The computer program product of claim 49, further comprising
computer readable program code means configured for applying the clustering for customer
segmentation for a sales or service organization wherein the data points comprise customers in a
database, wherein the lead values are any of a total purchase amount per unit time of said
customers, income of said customers, a number of times customers visited an electronic store,
and a number of items bought by the customer, wherein a relationship between customers is
either explicitly defined or derived from some relevant data, with a resulting clustering reflecting
a structured grouping of customers with similar performances.

66. (Previously Presented) The computer program product of claim 65, wherein a customer
di is related to a customer dj by a fraction of products bought by dj that are also bought by di. 